A Poissonian Model of Indel Rate Variation for Phylogenetic Tree Inference.
Abstract
While indel rate variation has been observed and analyzed in detail, it is not taken into account by current indel-aware phylogenetic reconstruction methods. In this work, we introduce a continuous-time stochastic process, the geometric Poisson indel process, that generalizes the Poisson indel process by allowing insertion and deletion rates to vary across sites. We design an efficient algorithm for computing the probability of a given multiple sequence alignment based on our new indel model. We describe a method to construct phylogeny estimates from a fixed alignment using neighbor joining. Using simulation studies, we show that ignoring indel rate variation may have a detrimental effect on the accuracy of the inferred phylogenies, and that our proposed method can sidestep this issue by inferring latent indel rate categories. We also show that our phylogenetic inference method may be more stable to taxa subsampling than methods that ignore either indels or indel rate variation. [evolutionary stochastic process; indel rate variation; Poisson indel process; TKF91.]
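For readers unfamiliar with the neighbor joining step mentioned in the abstract, the following is a minimal sketch of the generic algorithm on a pairwise distance matrix. It is not the authors' implementation: the abstract does not say how distances are derived from the indel model, so the function name `neighbor_joining`, the internal labels `node0`, `node1`, ..., and the toy five-taxon distances below are all illustrative assumptions.

```python
from itertools import combinations


def neighbor_joining(names, dist):
    """Return the edges (node, node, length) of an unrooted NJ tree.

    names: list of at least two taxon labels (assumed not to start with "node")
    dist:  dict mapping frozenset({a, b}) -> distance between a and b
    """
    nodes = list(names)
    d = dict(dist)  # working copy; grows as internal nodes are created
    edges = []
    next_id = 0

    def get(a, b):
        return 0.0 if a == b else d[frozenset((a, b))]

    while len(nodes) > 2:
        n = len(nodes)
        # Net divergence of each node: sum of its distances to all others.
        r = {a: sum(get(a, b) for b in nodes) for a in nodes}
        # Join the pair minimising Q(a, b) = (n - 2) d(a, b) - r(a) - r(b).
        a, b = min(combinations(nodes, 2),
                   key=lambda p: (n - 2) * get(*p) - r[p[0]] - r[p[1]])
        dab = get(a, b)
        # Branch lengths from a and b to the new internal node u.
        la = 0.5 * dab + (r[a] - r[b]) / (2 * (n - 2))
        lb = dab - la
        u = f"node{next_id}"  # hypothetical label for the internal node
        next_id += 1
        edges += [(a, u, la), (b, u, lb)]
        # Distances from u to every node that remains.
        rest = [c for c in nodes if c not in (a, b)]
        for c in rest:
            d[frozenset((u, c))] = 0.5 * (get(a, c) + get(b, c) - dab)
        nodes = rest + [u]

    a, b = nodes
    edges.append((a, b, get(a, b)))
    return edges


# Toy additive distance matrix (illustrative only, not from the paper).
taxa = ["a", "b", "c", "d", "e"]
pairs = {
    ("a", "b"): 5, ("a", "c"): 9, ("a", "d"): 9, ("a", "e"): 8,
    ("b", "c"): 10, ("b", "d"): 10, ("b", "e"): 9,
    ("c", "d"): 8, ("c", "e"): 7, ("d", "e"): 3,
}
example = {frozenset(k): float(v) for k, v in pairs.items()}
tree = neighbor_joining(taxa, example)
for left, right, length in tree:
    print(left, right, length)
```

Because the toy matrix is additive, the path lengths in the returned tree reproduce the input distances exactly. With distances estimated from real alignments this will generally not hold, and some branch lengths can come out negative.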
Similar resources
BayesCAT: Bayesian co-estimation of alignment and tree.
Traditionally, phylogeny and sequence alignment are estimated separately: first estimate a multiple sequence alignment and then infer a phylogeny based on the sequence alignment estimated in the previous step. However, uncertainty in the alignment is ignored, resulting, possibly, in overstated certainty in phylogeny estimates. We develop a joint model for co-estimating phylogeny and sequence al...
Phylogenetic Profiling of Insertions and Deletions in Vertebrate Genomes
Micro-indels are small insertion or deletion events (indels) that occur during genome evolution. The study of micro-indels is important, both in order to better understand the underlying biological mechanisms, and also for improving the evolutionary models used in sequence alignment and phylogenetic analysis. The inference of micro-indels from multiple sequence alignments of related genomes pos...
Stochastic Evolutionary Model for Protein Structure Alignment and Phylogeny
We present a stochastic process model for the joint evolution of protein primary and tertiary structure, suitable for use in alignment and estimation of phylogeny. Indels arise from a classic Links model and mutations follow a standard substitution matrix, while backbone atoms diffuse in three-dimensional space according to an Ornstein-Uhlenbeck process. The model allows for simultaneous estimat...
Evolutionary inference via the Poisson Indel Process.
We address the problem of the joint statistical inference of phylogenetic trees and multiple sequence alignments from unaligned molecular sequences. This problem is generally formulated in terms of string-valued evolutionary processes along the branches of a phylogenetic tree. The classic evolutionary process, the TKF91 model [Thorne JL, Kishino H, Felsenstein J (1991) J Mol Evol 33(2):114-124]...
Phylogenetic study of Class Armophorea (Alveolata, Ciliophora) based on 18S-rDNA data
The 18S rDNA phylogeny of Class Armophorea, a group of anaerobic ciliates, is proposed based on an analysis of 44 sequences (out of 195) retrieved from the NCBI/GenBank database. Emphasis was placed on the use of two nucleotide alignment criteria that involved variation in the gap-opening and gap-extension parameters and the use of rRNA secondary structure to orientate multiple-alignment. A sen...
Journal: Systematic Biology
Volume 66, Issue 5
Pages: -
Publication date: 2017